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Summary 

An algorithm for solving a large class of two- and three- 
dimensional nonseparable elliptic partial differential equations 
(PDE’s) is developed and tested. It uses a modified 
D’Yakanov-Gunn iterative procedure in which the relaxation 
factor is grid-point dependent. It is easy to implement and 
applicable to a variety of boundary conditions. It is also 
computationally efficient, as indicated by the results of 
numerical comparisons with other established methods. 
Furthermore the current algorithm has the advantage of 
possessing two important properties which the traditional 
iterative methods lack; that is, (1) the convergence rate is 
relatively insensitive to grid-cell size and aspect ratio, and (2) 
the convergence rate can be easily estimated by using the 
coefficient of the PDE being solved. 


Introduction 

Since the middle sixties, fast direct solvers (FDS’s) have 
been developed for the numerical solution of separable elliptic 
partial differential equations (PDE’s) (refs. 1 to 5). Based on 
Fourier analysis and cyclic reduction, FDS algorithms are most 
effective on a uniform rectangular grid. They can obtain the 
solution with efficiency far beyond the reach of traditional 
iterative procedures such as successive overrelaxation (SOR) 
methods. 

Generally, FDS algorithms are not directly applicable to an 
elliptic problem with either a computation domain of irregular 
shape or a nonseparable PDE. The limitation of the 
computation domain may be circumvented either by mapping 
the original domain onto a rectangular domain or by using the 
capacity matrix method (ref. 5). The limitation of a 
nonseparable PDE can be circumvented/ by a semidirect 
procedure, that is, an iterative procedure driven by an FDS. 
In this study, a new semidirect procedure is developed and 
used as an elliptic solver for both two-dimensional (2-D) and 
three-dimensional (3-D) problems. This new iterative 
procedure is easy to implement, computationally efficient, and 
applicable to a variety of boundary conditions. Furthermore 
it has the advantage of possessing two important properties 
which the traditional iterative methods lack; that is, (1) the 
convergence rate is insensitive to grid-cell size and aspect ratio, 
and (2) the convergence rate can be easily estimated with the 
coefficients of the PDE being solved. 


Many elliptic PDE’s can be expressed as 

Qu-h (1) 

where Q is a nonseparable second-order linear elliptic 
operator, u the dependent variable, and h a given source term. 
Equation (1) may be solved with the iterative procedure 

P(u n+ 1 - u n ) = - t(Qu" - h) (2) 

where n is the iteration number, r a nonzero relaxation factor, 
and P a separable elliptic operator, which can be directly 
inverted by an FDS. This procedure is a continuous analogue 
of the D’Yakanov-Gunn iterations (ref. 6) and was utilized 
by Concus and Golub (ref. 7) and Bank (ref. 8) in their works 
on the numerical solution of nonseparable elliptic equations. 
In the previous works involving iteration (2) the relaxation 
factor r is treated as a constant and the iteration is accelerated 
by an optimal choice of r. In the current report, a more 
efficient algorithm is obtained by using a spatially varying 
relaxation factor. 

The use of a local (spatially varying) relaxation factor in 
the current study is motivated by an earlier study of a 
semidirect procedure (ref. 9). In the previous study, the local 
convergence rate evaluated by using a simple von Neumann 
analysis, to a great extent, is consistent with the numerical 
results. Based on this observation, it becomes obvious that a 
local relaxation factor could be used in iteration (2) to optimize 
its local convergence rate. Recently, a similar idea was also 
used by Botta and Veldman (ref. 10) to develop their SOR- 
related local relaxation method. However, as shown later, 
there is an underlying reason which makes the use of a local 
relaxation factor in the current procedure particularly 
attractive. 

As shown by the work of Bank (ref. 8), iteration (2) can 
also be accelerated by choosing an operator P, which closely 
resembles the operator Q. Application of this technique, 
however, could be limited by the following considerations: 

(1) This technique may require the use of a general separable 
operator P. This, however, is computationally inefficient, since 
an FDS code for a general separable operator is about five 
times slower than one for the Laplacian V 2 (ref. 5). 

(2) To apply this technique, the FDS code for the operator 
P, generally, must be made to individual specifications. This 
may require a considerable effort. 

The preceding considerations lead us to choose P = V 2 or 
its equivalent in the current study. 
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In the section Analysis, the convergence rate of the central 
difference form of equation ( 2 ) is studied for a constant 
coefficient operator Q by assuming the iterative errors satisfy 
the periodic boundary conditions. The analysis is a rigorous 
version of the von Neumann analysis and its results are used 
to determine the optimal value of the relaxation factor t. In 
the section Local Relaxation, the results obtained in the section 
Analysis are extended to solve PDE’s with variable 
coefficients. In the section Numerical Evaluation, the current 
method is numerically evaluated with a variety of 2-D 
nonseparable elliptic PDE’s, In addition to the advantages 
noted previously, the results of this numerical evaluation 
indicate that the current procedure can be used to solve PDE’s 
with a cross-derivative term and that it works very well for 
many PDE’s with rapidly varying coefficients. 

Finally, in the section Application to a 3-D Flow Problem, 
the current procedure is incorporated into an Euler Solver 
(ref. 9) to obtain solutions for 3-D incompressible flows in 
a 180° turning channel. It is shown that this new procedure 
converges with a rate much higher than that reported in 
reference 9. The successive line overrelaxation (SLOR) 
calculations referred to in the section Numerical Evaluation 
were carried out by using a code developed by Shih-Hung 
Chang of Cleveland State University. 


Analysis 

As an initial step, iteration (2) is studied by assuming that 
7 is a constant and 


2 = a —5 + 2b 


dx 2 dxdy 



( 3 ) 


where a, b, and c are arbitrary constants subjected to the 
elliptic conditions 


a > 0 
c > 0 

ac — b 2 > 0 



( 4 ) 


Furthermore, it is assumed that 
d 2 d 2 

pd = a °Tl + C °T1 (°o >0,c o > 0) (5) 

ox dy 

where a 0 and c 0 are two arbitrary positive constants. When 
a uniform grid with grid intervals Ax and Ay in the x- and 
the y-directions is used, the central difference forms of 
equations ( 1 ) and ( 2 ) at a grid point (i,j) are 


2K;) = hij (6) 

and 

-<j)=-r[Q{ul)-h Uj ] (7) 

respectively. Here h t j is the source term and the finite 
difference operators Q and P, respectively, are defined by 

Q(Vij) d = - a(Ax)- 2 (v i+lJ + Vi _u - 2 v u ) 

+ c(Ay) ~ 2 (v iJ+i + v iy _, - 2 v,j) 

+ b(2AxAy) ~ l (v i+lJ+1 + 

~ v i+l,j-l ~ v i-lJ+l) 


( 8 ) 


and 

P(Vij) d =- a 0 (Ax) ~ 2 (v i+l j + Vi _u - 2 ViJ ) 

+ C a (Ay) " 2 (v,y + 1 + v,y_ ! - 2v,y) (9) 

where v tJ is any function of the grid point (ij). The operator 
P can be considered as a central difference Poisson operator 
for a uniform grid with grid intervals Ax/'/a,, and A y/Vc 0 . 
Thus, it may be inverted by using a fast Poisson solver. 

To study the convergence rate of iterative procedure (7), 
one notes that equations (6) and (7) imply that 

<io) 

where 

def. 1t n /in 

e iJ= Uij u it j (11) 

Given a set of efj s which satisfies the periodic conditions 

4j = e?+Kj = e?j +L (i,j = 0,±1,±2,...) (12) 

where K( > 2) and L( > 2) are two arbitrary integers, it is shown 
in appendix A that 

(1) e"j s (n = 1,2,...; i,j = 0,±1,±2,...) are uniquely 
determined by equation (10) and the following auxiliary 
conditions: 

e ij ~ e i+KJ = e l,j+L ( n = 1,2,...) (13) 

and 

(AT-l) (L-l) 

I] I] € U = 0 (/I =1,2,...) (14) 

i = 0 7=0 
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(2) Let 


<r p (W) <0 if (k,t) € * 


(22) 


vE exp “ f + f 


/-V=T 


(ij = 0, ±1,± 2,...; * = 0,1, 2,... ,(*-!); 


f = 0,1,2,..., (L — 1)) (15) 


a ^,Odef. _ 4 ^ _ sin („*/£) 


4- — sin (7T t/L) 


This inequality follows from the assumptions a 0 > 0 and 
> 0 , and the fact that ( 0 , 0 ) does not belong to 
As defined in equation (11), efj is the error of nth iterative 
solution. According to equation (21), this error is a sum of 
(Kx L- 1) terms, and each term is multiplied by the factor 
G(*'*)(t) as the iteration number n increases by 1. Obviously, 
the term with the greatest value of |G ( * ,f) (r)l eventually 
becomes dominant if the corresponding E does not 
vanish. Let the error norm \\e n \\ and the asymptotic error 
multiplication factor Af°°, respectively, be defined as 


\\e n \\t l 


(*- 1) t^-D / X2 T /2 

E E («&)* 

i=0 j = 0 


(k = 0,1,2 - 1); (=0,1,2 - 1)) 


f — 1 2 r “12 

a — sin (irk/K) + c — sin ( 7 rf/L) 
Ax Ay 


+ 2 b — sin ( itk/K ) — sin ( 7 xt/L) 

Ax Ay 


M°° d 5 f> lim 


Then, assuming every E 0,{kp?) ^ 0, it may be concluded that 


X cos (7 xk/K) cos (7r t/L) 


lim \e t y i \ 

n-.+cc — — = G(r) (i,j = 0,=t 1,±2,...) (25) 

c/ 1 


(* = 0,1,2,. ..,(*- 1); f= 0,1, 2,. ..,(1-1)) (17) 

(«■-!) (i-1) 

£ 0,(W)def. £ ^ <p?f> (*,*)€* (18) 

«= 0 )=0 

and 

G (*,0 (T )def. j _ T ^ (*,/) €* (19) 

where € means an element of, and ^ is the set of ordered pairs 
defined by 

^ { {(k,()\k = 0,l,2,...,(K- 1); 

( = 0,1,2,...,(L- 1); (k,0 * (0,0)] (20) 

Then the unique solution to equations (10), (13), and (14) is 
explicitly given by (n = 1,2,...) 

elj= £ [G<W(T)]".£°' ( «>. v fcO (21) 

(k,0€* 

One notes that G (W) (r) is well defined for all (k,t) € SI', 
since 


and 

M 00 = G(r) (26) 

where, for a given r, 

G(r) d J f Max [|G (W (r)l] 

(M)€* ( J 

A direct implication of equation (26) is that the value of 
M°° reaches its minimum if the parameter r is chosen such 
that the function G(r) is at its minimum. Let 

(*,f)def. °_q_ 

7 = ofO 

and 

7max= f ' Max [y (U) ] 

(kj)^ 

7mm = r Min [y (W) ] 

(*,()€* 

It is shown in equation (33) that Y max > y^ n > 0. As a result, 
one concludes from equation (19) that G(r) reaches its 
minimum 
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G ede f . G(j O) = 


S - 1 
E + 1 


< 1 


(27) 


Thus, in the limit of K,L-~ + oo, j° and G°, respectively, 
approach 


when 


T 


T o def. 


2 


"Ymax "h Tmin 


7* d l f ' 


^max "h ^min 


(28) 


and 


Here t° is the optimal relaxation factor, and 


G *de f . 


E* - 1 
E* + 1 


< 1 


^ def. Tmax 

y min 


(29) 


where 


(35) 


(36) 


Combining equations (26) and (27), one concludes that (1) 
M“ < 1 if t = t°, and (2) M 00 increases with an increase of 
E. 

The values of y max and 7 min , generally, are functions of the 
integers K and L. However, as will be shown, Y max and Y mjn 
approach two separate limits as the values of K and L increase. 
Let 


r» * def. ^max 

L = 7 — (37) 

''min 

Two comments on equations (33) to (35) are as follows: 

(1) The uniform bounds X max and X^, generally do not 
exist if the operator P is replaced by an operator of other type. 

(2) Since X max 4- X m j n = a 4- c, 


1(“ + £ + V (d-c) 2 + 4(^) 2 ) (30) T * = 

CL + c 


(38) 


and 

Xmin = f ' ^(a + c- V(d-c) 2 + 4(b) 2 ) (31) 


Using equations (30) to (32), (36), and (37), it can be shown 
that the parameter G* is a function of a> b , c , and c 0 la 0 . If 
the coefficients a , b , and c are known, G* becomes a function 
of the single variable cja 0 . As shown in appendix C, this 
function reaches its minimum 


where 


a d i f ‘ •— > 0 
a n 


c d l f - — > 0 


„ def. 




In appendix B, it is shown that 


r * def. 1^1 
^min = } — 

vac 


(39) 


when 


(32) 


c 0 c 


a 0 a 


(40) 


Furthermore, assuming cja 0 = c/a, it is shown in appendix 
C that 


\nax — 7 max — Ymin — ^min ^ 0 (33) 

and 

lim 7max — ^max 
K,L—+oo 

lim Ymin ^min 
oo 


r° — t* (41) 

for any finite integers K > 2 and L > 2. 

At this juncture, it is noted that equations (35) and (36) can 
also be derived (ref. 11), in a less rigorous fashion, by using 
a simple von Neumann analysis and theorem 1 in appendix B. 
The current analysis shows that the von Neumann analysis 
for equation (2.8) is justified only under many restricted 
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conditions. One of them, the uniqueness condition (14), 
generally is not required for other iterative procedures. 

This section concludes with a discussion on the possible 
generalization of the 2-D results to a space of higher 
dimension. In an N-dimensional space ( N > 2), equation (3) 
may be replaced by 


Q = 


N 


E 


Of 


(IV 


dx^dx v 


(42) 


where are real constants and x^ the independent variables. 
Furthermore, the elliptic condition (4) is replaced by the 
requirement that the matrix 

d d =' (<v) (43) 


is symmetric and positive definite (SPD). Also the operator 
P assumes the new form 


p »T^~ Ov > 0, /x = 1,2,3,...,#) (44) 

dx^ 

With the aid of equations (42) to (44) and theorem 1 in 
appendix B, equations (6) to (37) may be generalized in a 
straightforward manner. However, it should be cautioned that, 
for N > 2, the parameters \ max and A min are defined, 
respectively, as the greatest and the smallest eigenvalues of 
the SPD matrix 

A d i ( («„„) (45) 


P d i f ‘ ^ 


respectively, are replaced by a ijy by, and c f y. That is, the 
discretized values of a , b , and c at the grid point (ij). On the 
other hand, the coefficients a 0 and c 0 associated with the 
operator P (eq. (9)) are again assumed to be positive constants. 

The preceding definitions of P and Q are directly applicable 
to any internal grid point. On a periodic boundary, they are 
also applicable if the periodic conditions are invoked. 
Similarly, by using an extrapolation technique (ref. 12), the 
operators P and Q can be defined on a Neumann boundary. 

The relaxation factor t, in the VC version, is replaced by 
its grid-point dependent version r^-. Ideally, the values of r^-’s 
may be chosen such that the parameter M°° (eq. (24)) is 
minimized. Unfortunately, this approach is impractical because 
of the complexity arising from the variable nature of the 
coefficients of Q and the necessity to consider the boundary 
conditions. The alternative adopted in the current study is based 
on the following heuristic arguments: Recall that the analysis 
described in the previous section is a rigorous von Neumann 
analysis for equation (10). The results of this analysis are fully 
justified only under very restricted conditions. However, it 
is well known that the von Neumann analysis often gives useful 
results even when its application cannot be fully justified. 
Particularly, by freezing the variable coefficients at their values 
at the grid point under consideration, this analysis has been 
routinely used in the stability study of the numerical procedure 
solving PDE’s with variable coefficients. Because of the above 
considerations, the VC version of equation (38) is assumed 
to be 


Gy + Cij 


(47) 


where 


where 


a 


def. 

(XV - 


OL(XV 


(46) 


* def. > o and 


0 


Finally, it is noted that equations (38) to (41) have no trivial 
counterparts in a space with N > 2. 


Local Relaxation 

In this section, the numerical procedure developed in the 
previous section is extended to solve PDE’s with variable 
coefficients. To proceed, the operator Q is initially assumed 
to have the form defined in equation (3), with the 
understanding that the coefficients a, b, and c are functions 
of x and y subjected to the elliptic condition (4). 

In the variable coefficient (VC) version of the iterative 
procedure (7), the operator Q will be defined by using equation 
(8), with the understanding that the coefficients a, b , and c. 


In view of equations (25) and (36), it is also assumed that 


K/ +1 I 


lim 

ft — *+00 


G ij 




(48) 


where Gy is the local error multiplication factor defined by 


r* def. ^ 

gm EJ+ 1 


(49) 


The parameter £*• is the grid-point dependent version of E*. 
It will be evaluated by using equations (37) and (30) to (32) 
with the understanding that the coefficients a , b, and c , 
respectively, are replaced by a^, by, and c^. Let 
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E (<$ 


Max M 




G 00 = Max [Gy 
(fj)& 


where $ denotes the set of (/ J)’s where w^-’s are to be solved. 
Then, with the aid of equation (24), assumption (48) implies 


Several comments can be made relating to equation (52). 

(1) Since G 00 can be evaluated by using the known 
coefficients a 0 , c 0 , a ijf b ij9 and c ij9 the value of M°°, and thus 
the convergence rate of the current iterative procedure, can 
be predicted by using equation (52). 

(2) As long as the coefficients ay, by, and Cy do not vary 
greatly from one grid point to its neighbors, the value of G°° 
is not sensitive to the grid-cell size and aspect ratio. This 
observation coupled with equation (52) implies that the 
convergence behavior of the current numerical procedure, 
generally, may not be sensitive to the grid-cell size and aspect 
ratio. 

(3) The VC version of equation (7) can be expressed in a 
form in which the coefficients a 0 and c 0 appear only in the 
ratio cja 0 . As a result, the convergence behavior of the 
current iterative procedure is dependent on the ratio cja 0 , but 
not on the individual values of a 0 and c Q . Similarly, one can 
also show that the parameter G°° is dependent on the ratio 
cja 0 , but not on the individual values of a Q and c Q . Equation 
(52) suggests that, in order to maximize the convergence rate, 
the ratio c 0 !a o should be chosen such that G°° is at its 
minimum. 

Evaluation of the optimal value of cja 0 , generally, may 
involve complicated numerical calculations. However, in case 
that by = 0 for all (i,j) € it is shown in appendix D that G 00 
reaches its minimum 

G oo def. ^Anax^min — 1 

min= rw~ — (53) 

^^max^min "b 1 

if and only if 




may be evaluated either analytically or numerically. 

The current procedure can be modified to solve a class of 
self-adjoint PDE’s. That is, 


Q = e (*>y) = — j + — [q(x,y)—j (56) 

where p and q are arbitrary positive functions of x and y. A 
central difference operator Q + corresponding to the 
differential operator g + is defined by (ref. 13) 

Q + ( V i,j) d =' (Ax) 2 + P(i+\/2)j v i + \J 

~ (P(i-l/2)j + P(i+l/2)j)vi t j\ 


+ (Ay) 2 ^ q i(j _ i /2 )Vij - 1 + q i(j+ 

“ (?/(/- 1/2) + <lHj+\/2))vij 


where 


Pu±m/= P(x t =fc Axl2,yj) 


%(j±u 2 ) » Q(x h yj ± Ay/2) 

With the assumption that coefficients p and q do not vary 
greatly from one grid point to its neighbors, then 

e + (v;j)=A/A*r 2 (V;-ij + v i+ u - 2v t J) 

+ q ij (Ay)- 2 (v i j_i + v iJ+ i - 2 v tJ ) 

Thus Q + (Vij) = Q(Vij ) (eq. (8)), if a v = p i} , cy = q tj , and 
by — 0 for all (ij) € This observation coupled with 
equation (47) lead to the assumption 




where 


Pij + No- 
where py 6 =‘ pyla 0 and q t f~' qy/c 0 . Similarly, in the case that 
Q = Q+(x,y), the parameter G°° will be evaluated by 
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assuming a % = Py, Cy = and by = 0. Also the right sides 
of equations (53) and (54) will be evaluated with 


( 64 ) 


&nax= f - Max 

UJ)& 





(59) 


The technique of local relaxation described for 2-D 
problems, can be applied in a similar fashion, to 3-D problems. 
The value of this technique as a tool to solve PDE’s with 
variable coefficients will be demonstrated in the subsequent 
sections. 


Numerical Evaluation 

Numerical evaluation of the current method begins with the 
following preliminaries: 

(1) In this section the domain for all numerical problems 
is assumed to be 1 > x > 0 and 1 > y > 0. Moreover, the 
operator P is inverted by using a Fast Poisson Solver (ref. 14). 

(2) The convergence rate is evaluated by using 


O e (n)¥- -log 10 



(60) 


or 


O r (n) = f - - log 10 



(61) 


0‘(n) d = lim 0,(n) 

Ax,Ay~*0 

Numerical evaluation involving PDE’s with constant 
coefficients are given in reference 15. The first group of PDE’s 
to be studied includes 


o o d 2 u ~ o d 2 u 

(1 +2x 2 + 2y 2 )-^ + (l +x 2 +3> 2 )— 1 (65) 

dx dx 

, , d 2 u „ , ,, d 2 u 

(1 + 2x 2 + 2y 2 )— + (1 +x 2 + y 2 ) — 
dx dxdy 




+ (1 + x 2 + y 2 ) = 1 (66) 

dy 2 

(1 + 2x 

9 ~ 9 3 2 M ? 9. d 2 W „ 9 9 d 2 W 

2 + 2y 2 ) 2 + (1 +x 2 + y 1 ) +(1 +x 2 + y 2 ) 

dx dxdy dy z 


+ i 

; 1+3e <- , ^)]^ + [, +3e <' s+ » J >]| 


- 

[l +3<t<' !+ > Z, ]u= 1 (67) 

1 1 

i+c* 2 +yV; 

J 

Is] [{“«+*]%] 



= 1 ( = 2, 4, 6, 8 (68) 


where e n is the error norm defined in equation (50), while 
II r' 1 II is the residual norm defined by 



The solution u t j obtained to machine accuracy is used to 
evaluate O e (n). Furthermore, since w-j = o for all (i,j) € $ 
in the current numerical study, O e (n) can be interpreted as 
the number of correct digits in u n ltJ . 

(3) In view of equation (52), the parameters O e (n) and 
O r {n) will be predicted by using 

O t (n) = -n-log^G 00 ) (63) 

or its continuous version; that is, 


Fifteen numerical problems associated with the above PDE’s 
are defined in table I. The parameters MX and MY, 
respectively, are the numbers of grid intervals in the x and 
y directions. The other parameter IB specifies the particular 
set of boundary conditions (fig. 1). All these problems are 
solved by assuming a 0 = c 0 = 1 . 

Problems 1 to 5 are associated with the same PDE (65). They 
differ on the grid-cell size, aspect ratio, and boundary 
conditions. As shown in table I, the values of either 0 r (2O) 
or 0* t ( 20) are fairly accurate estimates of 0/20). Also, as 
expected from the current theoretical development and the 
experiences of other researchers (refs. 7 and 8), the effects 
of grid-cell size and aspect ratio on the convergence rate are 
minimal. Even the very large aspect ratio (16: 1) does not cause 
any significant reduction in the convergence rate. Furthermore, 
the convergence rate is insensitive to the particular set of 
boundary conditions used. 
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TABLE I . —NUMERICAL PROBLEMS ASSOCIATED WITH 
EQUATIONS (65) TO (68) AND COMPARISONS OF O r (n), 
O t (n), AND 0](n) 


[n = 20 for problems 1 to 5, n = 32 for problems 6 to 15.] 



u c 0 u{x, 1) = u(x, 0) duldy = 0 



(a) IB = 1. (b) IB - 2. <c) IB = 3. 


Figure L— Three sets of boundary conditions on a unit square. 


Problems 6 to 10 are associated with equation (66), which 
differs from equation (65) only in the appearance of a cross- 
derivative term. The numerical results indicate that the 
convergence rate may be substantially underestimated by the 
parameter 0/32) or 0 r *(32). Furthermore, it is more sensitive 
to the change of grid-cell size and aspect ratio. An explanation 
for these peculiar behaviors associated with a PDE with a 
cross-derivative term is given at the end of appendix B. 

The success of the current numerical method in solving a 
PDE with a cross derivative term is rather significant. This 
author is unaware of any earlier work which solves PDE’s 
of this type with a semidirect procedure. The lack of progress 
in this area may be due to the fact that it is very difficult to 
choose a separable operator P which closely resembles a 
nonseparable operator Q containing a cross-derivative term. 
(By definition, a separable operator P can not have a cross- 
derivative term.) 

8 


Equation (67) contains first-order and zero-order derivative 
terms. This type of PDE is solved by simply adding the central 
difference form of those terms to the term Q(u n it] ) in 
equation (7). The value of 0 r (32) for problem 11 indicates 
that the current procedure works very well even though the 
coefficients of first-order and zero-order derivative terms in 
equation (67) are of the same order of magnitude as the 
second-order terms. This is rather unexpected because the 
coefficients of lower order terms are completely neglected in 
the evaluation of the local relaxation factor. 

Equation (68) belongs to the class of self-adjoint PDE’s 
defined in equation (56). The variation of the values of the 
coefficients p and q increases progressively as one goes from 
i = 2 to l = 4 and so on. For l = 8, the increase in the values 
of p and q from one corner (x = y = 0) to another corner 
(x = y = 1) on the unit square is of the order of 100 times. 
It might appear that the technique of local relaxation is no 
longer valid. The results shown in table I indicate the current 
method is still useful in this extreme case. 

The numerical study of problems 1 to 15 concludes with 
a discussion on their convergence histories. Since equation (63) 
represents a linear relation between O t (n) and n, it is not 
surprising that the relations between O r (n) and n curves are 
closely approximated by straight lines for the above problems 
with the exception of perhaps problems 13 to 15. As shown 
in figure 2, the linear relation between O r (n) and n 
gradually deteriorates as the variation of the coefficients p and 
q increases. The robustness of the current algorithm is most 
evident in its ability to reverse the trend toward divergence 
during the first few iterations. 

The second group of PDE’s to be studied includes 



d_ 

dx 


i+l(-* 4 +/) 



+ ^ 1 + 1 (x 4 + /) ^ j = h 2 (x,y) (70) 

where h x ( x,y ) and h 2 (x,y) are source terms chosen such that 
u = U\ (x,y) d = f * sin x sin y (71) 

and 

u = u 2 (x,y) d =- [x(l - x)y( 1 - y)] 2 (72) 

respectively, are the exact solutions of equations (69) and (70). 








Figure 2.— Convergence histories of problem numbers 12 to 15. 


Definitions of the four numerical problems associated with 
equations (69) and (70) along with the values of O e ( 10), 
0 r ( 10), and 0,(10) are given in table II. The boundary values 
of u in these problems are specified by using equation (71) 
or equation (72). 

Problem 16 is one of the test problems used by Bank 
(ref. 8). Compared with the current value of O e ( 10) = 5.96, 
the values obtained by Bank are 3.49 without using any scaling 
technique, and 3.87, 4.81, and 6.76 using three different 
scaling functions. Since the operator P used in Bank’s method 
is a general separable operator, the corresponding FDS code 
usually must be made to individual specifications and is about 
five times slower than that for the Laplacian V 2 . Thus, the 
current algorithm is easier to use and, for problem 16, more 
efficient by at least a factor of 4. 

Problem 18 is another test problem used by Bank. Compared 
with the current value of O e (\0) = 8.59, the value obtained 
by Bank is 5.88 without using his scaling technique and 14.79 
if a scaling function is used. This problem along with problem 
19 was also solved by Concus and Golub (ref. 7). The method 
of Concus and Golub is also driven by a fast Poisson solver 
and the results obtained are comparable with ours. However, 
their method is applicable only when p — q as in the case of 
equation (70). 


The last PDE’s to be studied in this section are 


d 2 u + d ( 
dx 2 dy | 


1 + -(*-y) 


d M=o 

dy 


( 73 ) 


d 2 u d j 

dx 2 dy I 


10 + - (x - y) 
2 


* -o 

dy\ 


(74) 


An exact solution for both equations (73) and (74) is 


u = y + - x 2 


(75) 


Equations (73) and (74) were numerically solved by using a 
grid with MX = MY = 31 . It is assumed that the values of u 
at all boundary grid points are specified by using equation (75). 
To compare the efficiency of the current method with 
traditional iterative methods, these numerical problems are 
solved by using the current method along with SLOR 
(successive line overrelaxation). The results of central 
processing unit (CPU) time comparisons are summarized in 
table III. Those parameters used in this table which were not 
defined previously are as follows: 

N c smallest value of n which satisfies the convergence 
criterion 

(Ax) 2 . Hr" II < 10 “ 8 (76) 

where Hr" II is defined in equation (62) and Ax = 1/31 

Tf CPU time used in the execution of the FDS code 

T t total CPU time (IBM 370/3033AP) used to satisfy the 
convergence criterion (76) 

0 ) o optimal value of the relaxation factor used in the SLOR 
method (determined by repeated numerical experiments) 

According to table III, the total CPU time required for the 
solution of either equation (73) or equation (74) with SLOR 
is about twice that with the current method. This comparison 
becomes even more favorable toward the current method if 


TABLE II.— NUMERICAL PROBLEMS ASSOCIATED WITH 
EQUATIONS (69) and (70), AND COMPARISONS OF 
0,(10), O r ( 10), and 0,(10) 


TABLE III — CPU TIME COMPARISONS BETWEEN 
CURRENT METHOD AND SLOR METHOD 


Equation 

Solution 

method 

cja 0 


N c 

T„ 

sec 

sec 

(73) 

Current 

a 0.8839 

NA 

13 

1.871 

1.345 

(73) 

SLOR 

NA 

1.752 

83 

3.790 

NA 

(74) 

Current 

"9.989 

NA 

6 

0.888 

0.614 

(74) 

SLOR 

NA 

1.510 

44 

2.039 

NA 


Problem 

number 

Equation 

MX 

MY 

c o la 0 

0,00) 

0,(10) 

0,(10) 

16 

(69) 

16 

16 

"0.423 

5.96 

5.33 

3.92 

17 

(69) 

64 

64 

“0.379 

5.27 

4.22 

3.47 

18 

(70) 

16 

16 

“1.0 

8.59 

8.09 

oo 

19 

(70) 

64 

64 

“1.0 

8.47 

7.95 

oo 


Evaluated from equation (54). a Evaluate from equation (54). 


9 





one recalls that the prediction of 0 ) o is elusive. A small error 
in this prediction may result in a large increase in the value 
of T t . For example, in the solution of equation (73) with 
SLOR, a change of the value of the relaxation factor from 1.752 
to 1.680 results in an increase in the value of T t from 3.790 
to 6.565 seconds. On the other hand, as shown in table IV, 
the optimal value of cja 0 evaluated by using equation (54) 
usually is very accurate. Moreover, since the fast Poisson 
solver (ref. 14) currently used is a general purpose code, the 
value of Tf can be reduced further if the fast Poisson solver 
is optimized. 

To conclude this section, the current local relaxation 
procedure is compared numerically with a procedure which 
differs from the former only in the use of a constant relaxation 
factor t c . With the assumption of a Q = c 0 = 1, problem 16 
was solved with different values of r c . As shown in table V, 
O e ( 10) reaches its best value (== 2.278) at t c = 0.103. Even 
this best value is substantially below that (== 3.47) obtained 
by using the local relaxation method (a 0 = c 0 = 1). 
Furthermore, the accurate prediction of optimal r c is by no 
means easy (e.g., pp. 964 to 966 of ref. 8). Thus the current 
procedure has a clear edge over a procedure that uses a 
constant relaxation factor. 


Application to a Three-Dimensional 
Flow Problem 

In this section, the current semidirect procedure is 
incorporated into an Euler solver (ref. 9) to obtain the inviscid 
solution for 3-D steady incompressible rotational flow in a 
180° turning channel (fig. 3). 

The Euler solver is formed by the inner and the outer loops. 
The inner loop solves the elliptic equations, while the outer 
loop solves the hyperbolic equations. In each pass through the 


TABLE IV. -N c AND 
O r ( 13) AS FUNCTIONS 
OF cja 0 IN THE 
NUMERICAL 
SOLUTION OF 
EQUATION (73) 


C d&o 

N c 

O r (13) 

0.8 

14 

8.3343 

a .8839 

13 

9.4040 

.895 

13 

9.4891 

.897 

13 

9.4955 

.899 

13 

9.4987 

b .900 

13 

9.4990 

.901 

13 

9.4985 

.903 

13 

9.4949 

.905 

13 

9.4882 

1.0 

14 

8.4831 


Evaluated from equation (54). 
b Actual optimal value. 


TABLE V.- 
O e (\0) AS A 
FUNCTION OF 
t c FOR 
PROBLEM 16 


Tc 

0,(10) 

0.02 

0.6 

.06 

1.514 

.10 

2.259 

.102 

2.276 

.103 

2.278 

.104 

2.272 

.105 

2.255 

.11 

1.989 

.12 

1.2 

.13 

(a) 


a Do not converge if 
t c > 0.13. 



Figure 3.— A converging and diverging turning channel (* 3 is suppressed). 

inner loop, the velocity field V is updated to satisfy the 
continuity equation 

V • F= 0 (77) 

and the velocity-vorticity relation 

V X V= Q (78) 

where fi is a known divergence-free (i.e., V • O = 0) 
vorticity field. In the current study, a solution procedure 
different from that described in reference 9 is used to solve 
equations (77) and (78). The general solution of equations (77) 
and (78) can be expressed as 

V=V s +Vu (79) 

where V s is any special solution of equation (78) and u is a 
solution of 

V 2 n = — V • (80) 

As a result, once a special solution V s is obtained (ref. 9), the 
solution of equation (80) becomes the focal point of the inner- 
loop calculations. 

Let the coordinates (x Xt x 2 ,x 3 ) refer to physical space and 
(x lf x 2 ,x 3 ) to computational space. It is shown in reference 9 
that the turning channel in figure 3 is a mapping of a 
parallelepiped (2 > xj > -2, 0.75 > x 2 > 0.65, 0. 1 > x 3 > 0) 
in computational space. In physical space, equation (80) is a 
Poisson’s equation. However, it cannot be solved by using 
the current procedure, since the physical domain is not a 
parallelepiped. On the other hand, in computational space, the 
domain is a parallelepiped and equation (80) becomes 


d 2 u d 2 u d 2 u 

dx 2 dx 2 77 dx 2 



dV s> 2 
dx 2 


+ V 


dVs, 3 \ 

3*3 / 


( 81 ) 
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where V St i,V Si2 , and K 1j3 are the covariant components of the evaluated from equation (83). The convergence rates achieved 

known vector field Pj and by using the current procedure are dramatic improvements 

over those obtained in reference 9. 


v = v(x.y) d =' 


cosh («!) + cos (« 2 ) 
cosh ( irxi ) — cos (« 2 ) 


(82) 


Concluding Remarks 


The 3-D operator Q associated with equation (81) is a special 
case of that defined in equation (42). Let the parameter G°° 
and the set $ for 3-D problems be defined similar to their 
counterparts for 2-D problems. Using a line of arguments 
similar to that presented in appendix D shows that the 
parameter G°° reaches its minimum 


poo def. 
^min — 




(83) 


if and only if the coefficients of the operator P (eq. (44)) are 
chosen such that 


— = — = ( 84 ) 
Pi Pi 

Here *? max and r? min , respectively, are the maximum and 
minimum of the function q in $. 

In a successful effort to obtain a secondary flow solution 
in the turning channel, equation (81) was solved once during 
each of 25 passes through the inner loop (The source terms 
on the right side of equation (81) vary from one pass to 
another.) The central difference form of equation (81) is 
obtained by using a grid with 144 uniform intervals in the 
x r direction and 12 uniform intervals in both the x 2 - and 
jc 3 -directions. It is assumed that the normal derivative of u 
vanishes at all boundaries except at the exit plane (x { =2) 
where u = 0. Thus ?7 max — rj( — 2,0.65) == 0.9966 and 
rjmin = ^(0,0.75) = 0.1716. As suggested by equation (84), 
equation (81) was solved by assuming P { = P 2 = 1 and 
P = 0.4135. The values of 6> r (8) obtained range from 3.69 
to 4.48. All are higher than the value of O r (8) = 3.07 as 


An efficient semidirect procedure for solving a large class 
of nonseparable elliptic problems has been developed. In 
applying this method, a user simply evaluates the terms on 
the right side of equation (7) and uses them as the input for 
a fast Poisson solver. The user is not required to deal with 
a large sparse matrix as in the case of a traditional iterative 
procedure. 

The local relaxation factor is evaluated by using an algebraic 
formula. This formula along with a convergence rate prediction 
method is developed by using a heuristic argument. It is shown 
numerically that the prediction method is an effective tool for 
estimating the convergence rate. 

The convergence rate can be accelerated by optimizing the 
coefficients of the finite-difference operator P. It is shown that 
this optimization can be carried out easily for a large class 
of elliptic PDE’s. 

It is also shown that the convergence rate of the current 
procedure is relatively insensitive to the grid-cell size and 
aspect ratio. The underlying reason for this insensitivity is the 
existence of the uniform bounds A max and A^. Their 
existence also contributes greatly to the simplicity of the 
current procedure. 

Although not shown in this report, the current procedure 
may also be used to solve a second order quasi-linear elliptic 
PDE. Since the coefficients of this PDE are functions of the 
dependent variable and its derivatives, the local relaxation 
factor must be updated during each iteration for this type of 
application. 


National Aeronautics and Space Administration 
Lewis Research Center 
Cleveland, Ohio, January 13, 1986 
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Appendix A 

Derivation of Equation (21) 


In this appendix, we will derive equation (21) by using 
equations (10) to (20). To proceed, we define the sets 
(n = 0,1,2,...) 

e"= f - (^|/ = 0,l,2,...,(X-l);y = 0,l,2,...,(L-l)] (Al) 

As a result of equations (12) and (13), for a given n > 0, any 
e ?j ( i>j = 0, ± 1, ± 2,...) is equal to an element of e n . Each 
e n contains K X L elements and, therefore, can be considered 
as a vector in a (K X /((-dimensional vector space C <K/I ' > . 
Similarly, since 


Each vector e n can be expressed as a linear combination of 
<P (k,l) ’s. This fact coupled with equations (12), (13), (A2), 
and (A5) implies that 

(tf-l) {L- 1) 

4= E E 

k = 0 1=0 

(n = 0,1,2,...; i,j = 0, ± 1, ± 2,...) (A8) 

where 


(k,0 (k,t) (k,() ^ 

<Pi,j = <Pi+KJ = rij+L (fj = 0, ± 1, ± 2,...) 


(A2) 


any <Pij' e) (i,j = 0, ± 1, ± 2,...) is equal to an element of the 
set 


(*-l) (L-l) 


£"•<*,<) ttef. ^ ^ efjvfr 0 

i '=0 ;=0 


(A9) 


•P (W) = f {<P?f | i = o, 1 ,2, . . . ,(K - 1 ); 


Substituting equation (A8) into (10) and using equations (A5) 
to (A7), one obtains 


y = 0,1,2,. ..,(L— 1)] (A3) 

Each y> (W) can also be considered as a vector in C (KxL) . It 
can be shown that the set of ( K X L) vectors 

^ <W >|* = 0,1,2,...,(X- 1); f = 0,1,2,...,(L — 1)} (A4) 

forms an orthonormal basis for C (KxL) ; that is, 

(A-l) (L— l) 

E E <#%?/> = <*■<* 

1=0 7=0 


(E n +h(k,t) _ E n,(k,t)^ a (k,t) _ _ rE n,(k,l) a (k,l) 

(n = 0,1,2,...; k = 0,1,2,. 1); 

f = 0,1,2,..., (Z, — 1)) (A10) 

For k = ( = 0, equation (A10) is an identity since 
<’ 0) = <7 9 ° , ° ) = 0* O n the other hand, since a^ k,t) < 0 if 
( k ,( ) € , equation (A 10) implies that 

£«+i,(W) = [ G (W) (t) ] .£».<«> (jM) € * (All) 


(A:,/: = 0,1,2,...,(X — 1); f,l" = 0,1,2,...,(L — 1)) (A5) where G <k,i> (T) and V, respectively, are defined in equations 

(19) and (20). Equation (21) is an immediate result of equations 
(A8) and (All) and the assumption 

where b kk > is the Kronecker delta symbol. Also, it can be 

shown that £"’ (0 > 0) = 0 (n = 1,2,3,...) (A12) 


a wy>) - °i u vsy> 

and 


(A6) 


fiW** 0 ) = (A7) 

where P, <2> and are defined in the section 

Analysis. 


It should be noted that equation (A 12) is introduced to ensure 
the uniqueness of the solution. Using equation (A9) and the 
fact that <pfj 0) = I/VaX for ij = 0, ± 1, ±2,..., it can be 
shown that equation (A12) is equivalent to equation (14). 
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Appendix B 

Proof of Expressions (33) and (34) 


Expressions (33) and (34) will be established by using the 
following theorems: 


Theorem 1 

Let Xj and \ N , respectively, be the greatest and the smallest 
eigenvalues of an N x N real symmetric matrix D d =- (d^). 
Let real vectors 5 = (si,s 29 -..,s n ) and t = (fi,r 2> ...,^) satisfy 

N 

E (V 2 = 1 cbd 

^-1 

and 

(g 2 <l (m = 1,2 AO (B2) 

Then the following may be stated: 

N 

(1) X, > 

/X=l 


N N 

+ L L - x w ( B3 ) 

}X=\ v— 1 


(2) If, in addition, for /x = 1,2,...,N, then 

(D,*5,0 = Xi if and only if 1— (t fJ ) 2 = 0, for fx— 1 , 2 ,..., 
A, and is an eigenvector of the matrix D 

with eigenvalue X t . This statement remains true if is 
replaced by X N . 

(3) If 5 (1 < 8 < iV) is an integer such that d d8 — X 1? then 
= 0 for all v ^ 6(1 < p < N). As a result, ( D;s,t ) is 

independent of the component This statement remains true 
if Xj is replaced by X^. 

This theorem is a special case of theorem 1 in reference 16. 


Theorem 2 

Let a, h , and c be the constants defined in expression (32). 
Let 


sj > 0 ^ 

5 2 > 0 j 

and 

(h) 2 <\ 


(B6) 


(B7) 


Then the following may be stated: 

(1) A max > F > A min > 0 (B8) 

where X max and X min , respectively, are defined in equations 
(30) and (31). 

(2) If b 0 and a c, then f X max X m j n 

(3) If b = 0 and c > a (a > c), then 

(a) F = \ max if and only if s, = 0 (s 2 = 0) 

(b) F = Xmin if and only if s 2 = 0 (si = 0) 

(4) If b ^ 0, then 

(a) F = X max if and only if either 

Sy = sf 
s 2 = s 2 


1 1 = 1 


h 


b 


1*1 


or 


si = s* 
S2 = S2 + 
t { = -1 


h ~ 


b 

1*1 


F(s\,s 2 ,t\>h) i £' &(S\) 2 + b(s z ) 2 + 2bsyS 2 tyt 2 (B4) where 


where s 2 , t\, and r 2 are real variables and satisfy 
(s,) 2 + (s 2 ) 2 = 1 (B5) 


Sy +d f 


1*1 

V ( X max - «) 2 + (b ) 2 


>0 


(B9) 
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and 


To prove expressions (33) and (34), one notes that 




Xmax <3 

~ a ) 2 + (b ) 2 


> 0 


(BIO) 


y (k,l) def. 


a m 

(*,f) — a(s x ) + £(s y ) 2 


(b) F = X min if and only if either 


+ 2 hs x Syt x t y (k,t) 6 (B13) 


s 2 — ^2 


where a^ l \ a^ l \ a, h, t, and ¥ are defined in the section 
Analysis, and 




where (W) € ^ (B15) 



t x = ' cos(?r k/K) (k = 0,1, 2, ...,(/£■ — 1)) (B16) 

and 

f >= f ' cos (nVL) (f = 0, 1 ,2, . . . ,(L - 1)) (B17) 

Using equations (B14) to (B17), one concludes that (1) 
(s x f + (^) 2 = 1 , (2) s x >0, s y > 0, and (3) (t x f < 1 , 

(t y ) 2 < 1. As a result, part (1) of theorem 2 implies that 

Xmax ^ 7 (U) ^ x min > 0 (k,() € T (B18) 

Inequality (33) follows immediately from inequality (B18). 

Using parts (1) to (3) of theorem 2 and the facts that (1) 
■s* = 0 if & = 0 and f = 1,2,...(L- 1) and (2) s y = 0 if i = 0 
and k = 1,2 — 1), it can be shown that 


and (2) 

\nax ~ \nin >0, theorem 2 follows directly from 
theorem 1. QED 


^max Tmax 
\nin “ Tmin 


(£ = o) 
(b = 0 ) 


(B19) 
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Thus, for h = 0, equation (34) is true and it represents a 
condition weaker than equation (B19). 

To prove equation (34) for b 0, note that 


^ (-*• u j!i) 

-i) 


> I-Sjc — >Sl + I 


«*>!<*- 1 | 


C-y ^ ty 



(B21) 


To proceed, note that, without any loss of generality, it may 
be assumed that 


TJ*' (*f. »f. 1. iJf) 

T ('''' ** _1 ’ |f|) 


belong to 

D(F) d «- ((5„5 2 ^1^2)|(5l) 2 + (Slf = 1, 

5, > 0, s 2 > 0, (r,) 2 < 1, (t 2 ) 2 < l] 

However, since ( t \ ) 2 + ( t 2 ) 2 = 2 for 71 , 72 , 73 , and 74 while 
(O 2 + C t y ) 2 < 2 for all (k,l) € 'F, it follows that 7 ^ y 2i 73 , 
and 74 do not belong to the set 

r¥- (fe,v*,^)l(W)€*) 

Thus inequality (B18) and part (4) of theorem 2 imply that 

\nax ^ Ymax — Ymin ^ ^min ^ (B20) 

Furthermore, since (1) the function F is a continuous 
function over its domain D(F ), (2) T is a subset of D(F ), and 
( 3 ) for any neighborhood of any one of 71 , 72 , 73 , and 74 , one 
can find a pair of integers K 0 and L 0 (see below) such that 
the intersection of this neighborhood and T is not null if 
K> K 0 and L>L 0 . (Recall that both ^ and T are dependent 
on the integers K and L.) Thus, one concludes that equation 
(34) is valid for b 5 * 0. 

As an example, the existence of the integers K 0 and L 0 
referred to in (3) will be established as follows for the point 
7 ! with the assumption b x > 0 . 


Proof 

Since fe) 2 + (s y ) 2 = (s x ) 2 + (s 2 ) 2 = 1, it need only be 
proven that, given any e x > 0, e y > 0, and d x > 0, there exist 
K 0 and L 0 such that for any K>K 0 and L > L 0 , two integers 
k (K > k > 0) and t (L > l > 0) can be found to satisfy the 
following conditions: 


sf >6 X 1 - > <5* (B 22 ) 

With the aid of the assumption b > 0 and expressions (B14), 
(B16), (B17), and (B22), equation (B21) can be rewritten as 


, sin (7r 2/L) 

7 ) + > 1 > Yj 

sin ( 7 rk/K) 
irk 


e x > 2 sin 2 


2 K 


e v > 2 sin 2 — 

\ 2L , 


where 



Vl - (Sl + - fix? 


s i + - S, 


v 1 - (v r + w 
s 1 + + &X 


> V' 


>0 


(B23) 


It is easy to show that there exist two integers k 0 > 1 and 
t Q > 1 such that 


> k ±± > Lz ± >1l - 

* 0-1 K + 1 


(B24) 


Since 



lim 

sin [(4 + l)z] 

_ 4 + 1 

z-0 

sin [(fc D - l)z] 

K - 1 

lim 

sin [(4 - l)z] 

_ 4 - 1 

z-*0 

sin [(*„ + l)z] 

K + 1 
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One can find an integer M » Max [k 0 ,Q such that 


If 



sin [?r(l 0 + 1 )/M\ sin [tt(4 - \)!M] 

sin [t(Ic 0 - 1 )/M] sin [ir(k 0 + 1 )/M] 


> V 


(B25) 


and 


k d ?k 0 (ak 0 + l3) 

+ 0') 

are chosen, with the aid of expressions (B28) and (B29), it 
can be shown that 




(B31) 


Let 

K 0 d =- Mk 0 

L 0 ^Mt o 


(B27) 


For any K > K a and L > L () , there exist six unique integers 
a, 0, y, a', (3', and y' such that 


K = aMk 0 + /3M + y 
L = a 'M( 0 + 0'M + y' 


(B28) 


and 


a > 1 > /3 > 0 M > 7 > 0 

a' > 1 4 > j 8 ' > 0 M>y’ > 0 


(B29) 


Using inequalities (B25), (B26), and (B31), and the fact that 
M » Max [k 0 ,Q, it can be shown that inequality (B23) 
along with the conditions K > k > 0 and L > ( > 0 are 
satisfied if k and i are chosen according to equation (B30). 
QED 

This appendix concludes with a discussion on expressions 
(B19) and (B20). With the aid of equations (27) to (29) and 
(35) to (37), equation (B19) implies that r* = t° and G* = G° 
when h = 0 even if the integers K and L are finite. On the 
other hand, for h ^ 0, inequality (B20) implies that (1) 
G* > G° always and (2) r° * r‘ unless Ymax + 7 min = ^max 
+ ^min- Since the current local relaxation procedure and 
convergence rate prediction method are developed from the 
assumptions that G* = G° and r* = r°, one may expect that 
the current procedure works less well, and the predictions 
given by the parameter 0,(n) become more conservative for 
a PDE with a cross-derivative term. The second part of the 
above expectations was confirmed by the numerical results 
shown in the section Numerical Evaluation. 
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Appendix C 

Derivation of Equations (39) to (41) 


With the aid of equations (30) to (32), equation (37) can 
be rewritten as 

v . _ aa 0 + c + ^ (aa 0 - c) 2 + 4b 2 a 0 

L 7 , . - 1 (ci) 

aa 0 + c - v (aa 0 - c) z + 46 z a 0 


where a 0 d I f ' c 0 /a 0 > 0. It should be noted that, as a result 
of equation (4), the denominator of the fraction in equation 
(Cl) is always positive. In view of equations (36) and (Cl), 
one may consider the parameter G* as a function of a, b, c, 
and a 0 , and obtains 


dG ’ 


8 a(ac - b 2 )(a„ - c/a) 


da 0 


(E*+ l) 2 ^(acx 0 -cf+4b 2 a 0 


12 

aa 0 +c-^(aa 0 -c) 2 +4b 2 a 0 


(C2) 


where (aa Q - c ) 2 + 4 b 2 a 0 ^ 0 is assumed. Equation (C2) 
coupled with equation (4) implies that (1) dG*/da 0 < 0 if 
a 0 < c/a, and (2) dG‘/da 0 > 0 if a„ > c/a. Equations (39) 
and (40) are the direct results of properties (1) and (2). 

To prove equation (41), one notes that a = c if c 0 /a o = c/a. 
Thus equations (B13) to (B17) imply that 

(!) y(k,t) + y(K~k,t) = y (k,l) + y(k,L-t) = 2a 

and 


(2) y (0 ’^ = y( l ’°) = d 

for k = 1,2,...,(A"-1) and ( = 1,2,...,(L- 1). Consequently 
one concludes that 

Tmax "b Tmin — ^max "b ^min = 2d (C3) 

Equation (41) follows directly from equations (28), (35), and 
(C3). 
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Appendix D 

Derivation of Equations (53) and (54) 


To obtain the results given in equations (53) and (54), the 
parameters a 0 4 i L c 0 la 0 > 0 and 0,/^ c tJ la tJ > 0 are 
introduced. With the assumption that by = 0 for all (i,j) € 
equations (30) to (32), (37), and (49) can be used to show that 


Gy = J (<X 0 /(3y) 



(<X o /0jj) ~ 1 
(&o/ftij) + 1 


if ajfiy > 1 


1 - (a 0 /(3y) 

(<X 0 I @ij) + 1 


if 1 > a 0 /f3y > 0 

(Dl) 


Note that 


da,, 


> 0 if a 0 > a m 


dJ_ (« 0 ) 

da,, 


< 0 if a 0 < a m 


where a m d l f ' V0 max .0 min 


(D4) 


Proof 

As a result of equation (D2), J(a 0 ) equals to the greater 
of J(a o /0 m j and /(a o /0 max ). Since (1) a m /0 min > 1, (2) 
max < 1) and (3) 7(a m /0 min ) = 7(a m /0 max ), one 
concludes that 


dJ(j) 

dt 



> 0 if t > 1 


<0 if 1 > r > 0 


(D2) 


That is, the function J(t) increases monotonically if t > 1 and 
decreases monotonically if 1 > t > 0. 

Let 0max and 0 min be the parameters defined in 
equation (55) and, for a given o >. 0 , 


/(a 0 ) d J f ' Max 


{j(a 0 /(3y)} 


(D3) 


Assuming 0 max > 0 min , it can be shown that 


„ C nin) if — a n ^\ 

J(0t o ) = ( (D5) 

d ( oi 0 l ^ max ) if <x 0 ^ a m ) 

Inequality (D4) is a direct result of equations (D2) and (D5). 
QED 

Inequality (D4) implies that J (a 0 ) increases monotonically if 
a 0 > a m and decreases monotonically if a„ < a m . Since 
j(a 0 ) - G°° (eqs. (51), (Dl), and (D3)), equations (53) and 
(54) simply state the fact that J ( a n ) reaches its minimum 
J(aJ = J (a m /0 mi „) = G* in if and only if a 0 = a m . 

In case that 0 max = 0 min , equations (Dl) and (D3) imply 
that the minimum of J(a 0 ) is zero and it is reached if and 
only if a 0 = 0 O where (3 0 denotes the value of either 0 max or 
0min- Equations (53) and (54) obviously are valid for this 
special case. 
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